Networked Restless Bandits with Positive Externalities
Authors
Abstract
Restless multi-armed bandits are often used to model budget-constrained resource allocation tasks where receipt of the resource is associated with an increased probability of a favorable state transition. Prior work assumes that individual arms only benefit if they receive the resource directly. However, many allocation tasks occur within communities and can be characterized by positive externalities that allow arms to derive partial benefit when their neighbor(s) receive the resource. We thus introduce networked restless bandits, a novel multi-armed bandit setting in which arms are both restless and embedded within a directed graph. We then present Greta, a graph-aware, Whittle index-based heuristic algorithm that can be used to efficiently construct a constrained reward-maximizing action vector at each timestep. Our empirical results demonstrate that Greta outperforms comparison policies across a range of hyperparameter values and graph topologies. Code and appendices are available at https://github.com/crherlihy/networked_restless_bandits.
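The abstract describes selecting a budget-constrained action vector from Whittle indices while accounting for positive externalities along directed edges. The sketch below is a toy illustration of that idea only, not the authors' Greta algorithm: the scoring rule, the `bonus` parameter, and the function name are illustrative assumptions.

```python
import numpy as np

def select_actions(whittle_indices, adjacency, budget, bonus=0.5):
    """Toy budget-constrained action selection (illustrative, NOT Greta).

    Each arm's base priority is its Whittle index; an arm whose
    out-neighbors have high indices receives a small additive bonus,
    loosely reflecting positive externalities along directed edges.
    adjacency[i][j] = 1 means a directed edge from arm i to arm j.
    """
    w = np.asarray(whittle_indices, dtype=float)
    A = np.asarray(adjacency, dtype=float)
    # Externality-adjusted score: own index + bonus * sum of neighbors' indices.
    scores = w + bonus * A @ w
    # Pull the top-`budget` arms by adjusted score.
    chosen = np.argsort(scores)[::-1][:budget]
    action = np.zeros(len(w), dtype=int)
    action[chosen] = 1
    return action
```

With a large enough `bonus`, an arm with a low index of its own can be prioritized because acting on it benefits a high-value neighbor.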
Similar resources
Wireless Channel Selection with Restless Bandits
Wireless devices are often able to communicate on several alternative channels; for example, cellular phones may use several frequency bands and are equipped with base-station communication capability together with WiFi and Bluetooth communication. Automatic decision support systems in such devices need to decide which channels to use at any given time so as to maximize the long-run average thr...
Opportunistic Scheduling as Restless Bandits
In this paper we consider energy efficient scheduling in a multiuser setting where each user has a finite sized queue and there is a cost associated with holding packets (jobs) in each queue (modeling the delay constraints). The packets of each user need to be sent over a common channel. The channel qualities seen by the users are time-varying and differ across users; also, the cost incurred, i...
Modeling Human Performance in Restless Bandits with Particle Filters
Bandit problems provide an interesting and widely-used setting for the study of sequential decision-making. In their most basic form, bandit problems require people to choose repeatedly between a small number of alternatives, each of which has an unknown rate of providing reward. We investigate restless bandit problems, where the distributions of reward rates for the alternatives change over ti...
On an Index Policy for Restless Bandits
We investigate the optimal allocation of effort to a collection of n projects. The projects are 'restless' in that the state of a project evolves in time, whether or not it is allocated effort. The evolution of the state of each project follows a Markov rule, but transitions and rewards depend on whether or not the project receives effort. The objective is to maximize the expected time-average ...
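The blurb above describes the restless-bandit index policy: each project receives a subsidy-based priority index, computed per state as the passive subsidy at which acting and idling become equally attractive. The following is a minimal sketch of that computation for a single discounted arm via value iteration plus bisection; the function names, the discount factor, and the bisection bounds are assumptions for illustration, and indexability of the arm is taken for granted.

```python
import numpy as np

def q_values(P_act, P_pas, r_act, r_pas, lam, gamma=0.9, iters=500):
    """Value iteration for one arm when idling pays an extra subsidy lam.

    Returns the converged Q-values of the active and passive actions.
    """
    P_act, P_pas = np.asarray(P_act, float), np.asarray(P_pas, float)
    r_act, r_pas = np.asarray(r_act, float), np.asarray(r_pas, float)
    V = np.zeros(len(r_act))
    for _ in range(iters):
        Qa = r_act + gamma * P_act @ V          # act on the project
        Qp = r_pas + lam + gamma * P_pas @ V    # idle and collect subsidy
        V = np.maximum(Qa, Qp)
    return Qa, Qp

def whittle_index(P_act, P_pas, r_act, r_pas, state, lo=-10.0, hi=10.0, tol=1e-6):
    """Bisect for the subsidy at which `state` is indifferent between
    acting and idling (assumes the arm is indexable)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        Qa, Qp = q_values(P_act, P_pas, r_act, r_pas, mid)
        if Qa[state] > Qp[state]:
            lo = mid  # acting still preferred: subsidy too small
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a sanity check: if acting and idling have identical transitions and the active reward exceeds the passive reward by exactly 1 in some state, the index of that state is 1, the subsidy needed to make idling competitive.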
Restless Bandits, Partial Conservation Laws and Indexability
We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both this range and the optimal indices being det...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i10.26415